Overview

Dataset statistics

Number of variables 10
Number of observations 194054
Missing cells 0
Missing cells (%) 0.0%
Duplicate rows 0
Duplicate rows (%) 0.0%
Total size in memory 14.8 MiB
Average record size in memory 80.0 B
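
The two memory figures above are consistent with each other: the total size is the per-record size multiplied by the number of observations. A minimal sketch of that arithmetic follows, using only numbers reported in this overview. The names are illustrative, and reading 80 B per record as 8 bytes for each of the 10 variables is an assumption about how the frame is stored, not something the report states.

```python
# Figures copied from the "Dataset statistics" block.
N_VARIABLES = 10
N_OBSERVATIONS = 194054
AVG_RECORD_BYTES = 80.0  # "Average record size in memory"

MIB = 1024 ** 2  # one mebibyte in bytes


def size_mib(n_rows: int, bytes_per_row: float) -> float:
    """In-memory size in MiB implied by a row count and a per-row size."""
    return n_rows * bytes_per_row / MIB


total = size_mib(N_OBSERVATIONS, AVG_RECORD_BYTES)

# Assumption: every variable occupies the same number of bytes per row.
bytes_per_cell = AVG_RECORD_BYTES / N_VARIABLES
per_column = size_mib(N_OBSERVATIONS, bytes_per_cell)

print(f"total      {total:.1f} MiB")
print(f"per column {per_column:.1f} MiB")
```

Under that assumption the per-column figure also lines up with the "Memory size 1.5 MiB" reported for each variable.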

Variable types

Text 3
Categorical 1
Numeric 6

Alerts

actual is highly overall correlated with prediction and 1 other field (High correlation)
prediction is highly overall correlated with actual (High correlation)
fdr is highly overall correlated with f_statistic and 1 other field (High correlation)
f_pvalue is highly overall correlated with fdr and 1 other field (High correlation)
f_statistic is highly overall correlated with fdr and 1 other field (High correlation)
residual is highly overall correlated with actual (High correlation)

Reproduction

Analysis started 2025-04-28 13:39:58.422992
Analysis finished 2025-04-28 13:40:08.115629
Duration 9.69 seconds
Software version ydata-profiling v4.16.1
Download configuration config.json
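
The reported duration can be recomputed from the two timestamps above. This is a small sketch using Python's standard `datetime` module; the format string is an assumption based on how the timestamps are printed here.

```python
from datetime import datetime

# Timestamp layout as printed in the "Reproduction" block (assumed format).
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2025-04-28 13:39:58.422992", FMT)
finished = datetime.strptime("2025-04-28 13:40:08.115629", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration = (finished - started).total_seconds()
print(f"Duration {duration:.2f} seconds")
```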

Variables

sample
Text

Distinct 314
Distinct (%) 0.2%
Missing 0
Missing (%) 0.0%
Memory size 1.5 MiB

Length

Max length 9
Median length 9
Mean length 7.6552609
Min length 4

Characters and Unicode

Total characters 1485534
Distinct characters 16
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
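
As a rough illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for two identifiers taken from the Sample table of this report. It is not the profiler's own implementation, and the helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Two `sample` identifiers that appear in the report's Sample table.
values = ["GSM228562", "Hyb9"]
counts = category_counts(values)

# Lu = uppercase letter, Ll = lowercase letter, Nd = decimal digit.
for category, count in sorted(counts.items()):
    print(category, count)
```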

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row GSM228562
2nd row GSM228562
3rd row GSM228562
4th row GSM228562
5th row GSM228562
Value Count Frequency (%)
hyb2 1048 0.5%
hyb18 1048 0.5%
hyb19 1048 0.5%
hyb12 1048 0.5%
hyb11 1048 0.5%
hyb10 1048 0.5%
hyb1 1048 0.5%
hyb16 1048 0.5%
hyb15 1048 0.5%
hyb14 1048 0.5%
Other values (304) 183574 94.6%

Most occurring characters

Value Count Frequency (%)
2 286120 19.3%
8 135458 9.1%
M 131174 8.8%
S 131174 8.8%
G 131174 8.8%
6 112654 7.6%
5 93480 6.3%
4 66500 4.5%
H 62880 4.2%
y 62880 4.2%
Other values (6) 272040 18.3%

Most occurring categories

Value Count Frequency (%)
(unknown) 1485534 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 1485534 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 1485534 100.0%


nichd
Categorical

Distinct 5
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 1.5 MiB
middle_childhood 70156
early_childhood 56686
early_adolescence 54012
toddler 10056
infancy 3144

Length

Max length 17
Median length 16
Mean length 15.37402
Min length 7

Characters and Unicode

Total characters 2983390
Distinct characters 16
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row early_adolescence
2nd row early_adolescence
3rd row early_adolescence
4th row early_adolescence
5th row early_adolescence

Common Values

Value Count Frequency (%)
middle_childhood 70156 36.2%
early_childhood 56686 29.2%
early_adolescence 54012 27.8%
toddler 10056 5.2%
infancy 3144 1.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
middle_childhood 70156 36.2%
early_childhood 56686 29.2%
early_adolescence 54012 27.8%
toddler 10056 5.2%
infancy 3144 1.6%

Most occurring characters

Value Count Frequency (%)
d 468120 15.7%
l 371764 12.5%
e 352946 11.8%
o 317752 10.7%
h 253684 8.5%
c 238010 8.0%
i 200142 6.7%
_ 180854 6.1%
a 167854 5.6%
r 120754 4.0%
Other values (6) 311510 10.4%

Most occurring categories

Value Count Frequency (%)
(unknown) 2983390 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 2983390 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 2983390 100.0%


probe
Text

Distinct 959
Distinct (%) 0.5%
Missing 0
Missing (%) 0.0%
Memory size 1.5 MiB

Length

Max length 12
Median length 9
Mean length 9.9364095
Min length 6

Characters and Unicode

Total characters 1928200
Distinct characters 16
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 1431_at
2nd row 1494_f_at
3rd row 177_at
4th row 200642_at
5th row 200697_at
Value Count Frequency (%)
206094_x_at 1773 0.9%
215125_s_at 1773 0.9%
208596_s_at 1773 0.9%
207126_x_at 1182 0.6%
204532_x_at 1182 0.6%
221305_s_at 985 0.5%
221304_at 985 0.5%
222094_at 788 0.4%
233334_x_at 788 0.4%
218021_at 394 0.2%
Other values (949) 182431 94.0%

Most occurring characters

Value Count Frequency (%)
2 288499 15.0%
_ 281656 14.6%
a 195894 10.2%
t 194054 10.1%
0 164592 8.5%
1 134559 7.0%
3 100001 5.2%
5 98462 5.1%
4 96559 5.0%
6 77185 4.0%
Other values (6) 296739 15.4%

Most occurring categories

Value Count Frequency (%)
(unknown) 1928200 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 1928200 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 1928200 100.0%

gene_symbol
Text

Distinct 443
Distinct (%) 0.2%
Missing 0
Missing (%) 0.0%
Memory size 1.5 MiB

Length

Max length 15
Median length 12
Mean length 5.4460047
Min length 2

Characters and Unicode

Total characters 1056819
Distinct characters 35
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row CYP2E1
2nd row CYP2A6
3rd row PLD1
4th row SOD1
5th row HK1
Value Count Frequency (%)
ids 1970 1.0%
slc8a1 1779 0.9%
ugt1a1 1773 0.9%
gls 1576 0.8%
slc6a2 1576 0.8%
ugt1a9 1379 0.7%
cyp2c9 1379 0.7%
ugt1a8 1379 0.7%
pou2f2 1379 0.7%
pld1 1262 0.7%
Other values (433) 178602 92.0%

Most occurring characters

Value Count Frequency (%)
A 124360 11.8%
1 112011 10.6%
C 107960 10.2%
S 79716 7.5%
L 66431 6.3%
2 60866 5.8%
P 53189 5.0%
T 43364 4.1%
D 31901 3.0%
R 28035 2.7%
Other values (25) 348986 33.0%

Most occurring categories

Value Count Frequency (%)
(unknown) 1056819 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 1056819 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 1056819 100.0%


actual
Real number (ℝ)

High correlation 

Distinct 76729
Distinct (%) 39.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 138.36131
Minimum 5.2009343
Maximum 12232.902
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 1.5 MiB

Quantile statistics

Minimum 5.2009343
5-th percentile 10.606114
Q1 22.65203
median 46.402756
Q3 106.2966
95-th percentile 538.99159
Maximum 12232.902
Range 12227.701
Interquartile range (IQR) 83.64457

Descriptive statistics

Standard deviation 388.46238
Coefficient of variation (CV) 2.807594
Kurtosis 216.89838
Mean 138.36131
Median Absolute Deviation (MAD) 29.291114
Skewness 11.817311
Sum 26849565
Variance 150903.02
Monotonicity Not monotonic
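
Several of the figures above are derived from others: range from minimum and maximum, IQR from the quartiles, CV from standard deviation and mean, variance from standard deviation, and sum from mean and row count. The sketch below recomputes them from the rounded values printed in this report and compares the results with the reported ones. The tolerance is an arbitrary choice that allows for rounding, and the dictionary names are illustrative.

```python
import math

# Values copied from the `actual` statistics above (already rounded).
N = 194054
MINIMUM, MAXIMUM = 5.2009343, 12232.902
Q1, Q3 = 22.65203, 106.2966
MEAN, STD = 138.36131, 388.46238

derived = {
    "range": MAXIMUM - MINIMUM,
    "iqr": Q3 - Q1,
    "cv": STD / MEAN,        # coefficient of variation
    "variance": STD ** 2,
    "sum": MEAN * N,
}
reported = {
    "range": 12227.701,
    "iqr": 83.64457,
    "cv": 2.807594,
    "variance": 150903.02,
    "sum": 26849565,
}

for name, value in derived.items():
    # Relative tolerance absorbs rounding in the printed inputs.
    agrees = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name:9s} derived={value:.6f} reported={reported[name]} agrees={agrees}")
```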
Histogram with fixed size bins (bins=50)

Value Count Frequency (%)
16.16396997 70 < 0.1%
13.00912799 57 < 0.1%
17.66900607 55 < 0.1%
13.80442339 54 < 0.1%
22.58242099 54 < 0.1%
15.82444055 51 < 0.1%
15.8920675 50 < 0.1%
15.72350509 49 < 0.1%
28.48592034 49 < 0.1%
35.44566108 48 < 0.1%
Other values (76719) 193517 99.7%

Value Count Frequency (%)
5.200934277 1 < 0.1%
5.262655185 2 < 0.1%
5.286312991 1 < 0.1%
5.287979996 1 < 0.1%
5.320035545 1 < 0.1%
5.348016531 1 < 0.1%
5.348665504 1 < 0.1%
5.354240401 1 < 0.1%
5.361693787 1 < 0.1%
5.381110847 1 < 0.1%

Value Count Frequency (%)
12232.90214 1 < 0.1%
12008.27225 1 < 0.1%
11986.26306 1 < 0.1%
11941.34582 1 < 0.1%
11850.44007 1 < 0.1%
11817.71133 1 < 0.1%
11812.5872 1 < 0.1%
11549.45386 1 < 0.1%
11482.28835 1 < 0.1%
11401.24467 1 < 0.1%

prediction
Real number (ℝ)

High correlation 

Distinct 176989
Distinct (%) 91.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 138.36131
Minimum -628.79359
Maximum 11524.11
Zeros 0
Zeros (%) 0.0%
Negative 701
Negative (%) 0.4%
Memory size 1.5 MiB

Quantile statistics

Minimum -628.79359
5-th percentile 10.784549
Q1 23.947807
median 47.498616
Q3 107.20521
95-th percentile 537.38797
Maximum 11524.11
Range 12152.903
Interquartile range (IQR) 83.257406

Descriptive statistics

Standard deviation 376.83315
Coefficient of variation (CV) 2.7235443
Kurtosis 209.68467
Mean 138.36131
Median Absolute Deviation (MAD) 29.48177
Skewness 11.579905
Sum 26849565
Variance 142003.22
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Value Count Frequency (%)
28.43981028 9 < 0.1%
28.3649831 9 < 0.1%
94.72998205 9 < 0.1%
28.82278141 9 < 0.1%
51.14005072 9 < 0.1%
91.54748312 9 < 0.1%
84.97984735 9 < 0.1%
30.79830258 9 < 0.1%
36.94538579 9 < 0.1%
25.12841837 9 < 0.1%
Other values (176979) 193964 > 99.9%

Value Count Frequency (%)
-628.7935934 1 < 0.1%
-616.1722196 1 < 0.1%
-556.0704121 2 < 0.1%
-473.2518679 1 < 0.1%
-451.3253869 1 < 0.1%
-441.9756527 1 < 0.1%
-420.2652646 1 < 0.1%
-417.2540298 1 < 0.1%
-402.236643 1 < 0.1%
-388.1981059 1 < 0.1%

Value Count Frequency (%)
11524.1099 1 < 0.1%
11373.17708 1 < 0.1%
11047.70826 1 < 0.1%
10950.32993 1 < 0.1%
10928.08893 1 < 0.1%
10856.4793 1 < 0.1%
10775.93164 1 < 0.1%
10672.77182 1 < 0.1%
10669.51316 1 < 0.1%
10661.02157 1 < 0.1%

residual
Real number (ℝ)

High correlation 

Distinct 176989
Distinct (%) 91.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean -3.8844302 × 10⁻¹⁴
Minimum -3566.9962
Maximum 6094.7571
Zeros 0
Zeros (%) 0.0%
Negative 106052
Negative (%) 54.7%
Memory size 1.5 MiB

Quantile statistics

Minimum -3566.9962
5-th percentile -62.34314
Q1 -8.9726341
median -0.78726069
Q3 6.984796
95-th percentile 59.324891
Maximum 6094.7571
Range 9661.7534
Interquartile range (IQR) 15.95743

Descriptive statistics

Standard deviation 94.338726
Coefficient of variation (CV) -2.4286375 × 10¹⁵
Kurtosis 285.53276
Mean -3.8844302 × 10⁻¹⁴
Median Absolute Deviation (MAD) 7.9763929
Skewness 4.1824022
Sum -7.1195245 × 10⁻⁹
Variance 8899.7953
Monotonicity Not monotonic
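
The very large coefficient of variation above follows directly from its definition, standard deviation divided by mean, when the mean is zero up to floating-point error. A short sketch using the two reported values shows the effect. Reading the mean as numerically zero is an interpretation, not something the report states.

```python
import math

# Values copied from the `residual` statistics above.
STD = 94.338726
MEAN = -3.8844302e-14  # effectively zero at double precision

# CV = standard deviation / mean; a near-zero mean inflates the ratio.
cv = STD / MEAN
print(f"CV = {cv:.7e}")

# The reported CV agrees with this ratio up to rounding of the inputs.
matches_report = math.isclose(cv, -2.4286375e15, rel_tol=1e-6)
print("matches reported CV:", matches_report)
```

For a variable centred on zero like this one, the standard deviation or the MAD is usually the more informative measure of spread.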
Histogram with fixed size bins (bins=50)

Value Count Frequency (%)
-2.978719658 9 < 0.1%
-3.985364064 9 < 0.1%
10.20108708 9 < 0.1%
-11.83414315 9 < 0.1%
13.3048253 9 < 0.1%
5.948151605 9 < 0.1%
21.69867072 9 < 0.1%
5.368534417 9 < 0.1%
-5.492569319 9 < 0.1%
-1.357155355 9 < 0.1%
Other values (176979) 193964 > 99.9%

Value Count Frequency (%)
-3566.996238 1 < 0.1%
-3478.386128 1 < 0.1%
-3161.473045 1 < 0.1%
-2579.662309 1 < 0.1%
-2219.270274 1 < 0.1%
-1972.140323 1 < 0.1%
-1971.806054 1 < 0.1%
-1969.435409 1 < 0.1%
-1968.602113 1 < 0.1%
-1956.043967 1 < 0.1%

Value Count Frequency (%)
6094.757115 1 < 0.1%
3900.076232 1 < 0.1%
3535.515059 1 < 0.1%
3440.438665 1 < 0.1%
2853.126021 1 < 0.1%
2741.21263 1 < 0.1%
2648.913866 1 < 0.1%
2606.186316 1 < 0.1%
2560.685639 2 < 0.1%
2492.359834 1 < 0.1%

fdr
Real number (ℝ)

High correlation 

Distinct 877
Distinct (%) 0.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 0.11189482
Minimum 8.0389847 × 10⁻¹⁹
Maximum 0.99290141
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 1.5 MiB

Quantile statistics

Minimum 8.0389847 × 10⁻¹⁹
5-th percentile 8.1872198 × 10⁻¹⁵
Q1 9.6305709 × 10⁻⁹
median 0.00014946456
Q3 0.051977166
95-th percentile 0.75354244
Maximum 0.99290141
Range 0.99290141
Interquartile range (IQR) 0.051977156

Descriptive statistics

Standard deviation 0.23834634
Coefficient of variation (CV) 2.1300928
Kurtosis 4.010801
Mean 0.11189482
Median Absolute Deviation (MAD) 0.00014946456
Skewness 2.2731919
Sum 21713.637
Variance 0.056808977
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Value Count Frequency (%)
5.240879432 × 10⁻⁸ 2167 1.1%
0.8829791563 2130 1.1%
8.486180637 × 10⁻⁵ 1773 0.9%
4.900735929 × 10⁻⁶ 1182 0.6%
0.7188213464 1182 0.6%
1.739125389 × 10⁻⁷ 1182 0.6%
2.879599495 × 10⁻¹² 985 0.5%
0.04116807971 788 0.4%
0.0164760382 788 0.4%
0.000160017519 591 0.3%
Other values (867) 181286 93.4%

Value Count Frequency (%)
8.038984721 × 10⁻¹⁹ 197 0.1%
1.691303207 × 10⁻¹⁸ 197 0.1%
2.64633239 × 10⁻¹⁸ 197 0.1%
4.803742965 × 10⁻¹⁸ 394 0.2%
5.485955076 × 10⁻¹⁸ 197 0.1%
6.501741943 × 10⁻¹⁸ 197 0.1%
9.601054928 × 10⁻¹⁸ 197 0.1%
1.826470625 × 10⁻¹⁷ 197 0.1%
2.56269909 × 10⁻¹⁷ 197 0.1%
3.141920711 × 10⁻¹⁷ 394 0.2%

Value Count Frequency (%)
0.9929014061 197 0.1%
0.9913209402 197 0.1%
0.9866744108 591 0.3%
0.979866654 197 0.1%
0.9769315836 197 0.1%
0.9713340928 197 0.1%
0.967187453 80 < 0.1%
0.9647886324 197 0.1%
0.9598986841 80 < 0.1%
0.9558857709 197 0.1%

f_statistic
Real number (ℝ)

High correlation 

Distinct 959
Distinct (%) 0.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 25.312626
Minimum 7.9357806 × 10⁻⁵
Maximum 118.115
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 1.5 MiB

Quantile statistics

Minimum 7.9357806 × 10⁻⁵
5-th percentile 0.13191008
Q1 4.3826629
median 16.375423
Q3 39.172581
95-th percentile 79.001213
Maximum 118.115
Range 118.11492
Interquartile range (IQR) 34.789918

Descriptive statistics

Standard deviation 25.844862
Coefficient of variation (CV) 1.0210265
Kurtosis 0.55344799
Mean 25.312626
Median Absolute Deviation (MAD) 14.593443
Skewness 1.145873
Sum 4912016.3
Variance 667.95688
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Value Count Frequency (%)
17.6066331 1773 0.9%
0.0276985382 1773 0.9%
34.86906227 1773 0.9%
0.1717596018 1182 0.6%
24.06743353 1182 0.6%
31.91198938 985 0.5%
60.94618329 985 0.5%
6.532261215 788 0.4%
4.745979548 788 0.4%
55.470904 394 0.2%
Other values (949) 182431 94.0%

Value Count Frequency (%)
7.935780645 × 10⁻⁵ 197 0.1%
0.0001477381937 197 0.1%
0.0003700914575 394 0.2%
0.0004637703998 197 0.1%
0.0009930464802 197 0.1%
0.001326756269 197 0.1%
0.001993672622 197 0.1%
0.002616872536 80 < 0.1%
0.002975347702 197 0.1%
0.003842236844 80 < 0.1%

Value Count Frequency (%)
118.115003 197 0.1%
113.5721515 197 0.1%
110.9068887 197 0.1%
108.1158634 197 0.1%
107.4835001 197 0.1%
106.5162376 197 0.1%
105.5249249 197 0.1%
103.9309277 197 0.1%
101.6288633 197 0.1%
100.2938413 197 0.1%

f_pvalue
Real number (ℝ)

High correlation 

Distinct 959
Distinct (%) 0.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 0.10278914
Minimum 8.1610273 × 10⁻²²
Maximum 0.99290141
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 1.5 MiB

Quantile statistics

Minimum 8.1610273 × 10⁻²²
5-th percentile 4.1557564 × 10⁻¹⁶
Q1 2.4148636 × 10⁻⁹
median 7.4851824 × 10⁻⁵
Q3 0.039035239
95-th percentile 0.71685318
Maximum 0.99290141
Range 0.99290141
Interquartile range (IQR) 0.039035237

Descriptive statistics

Standard deviation 0.22758649
Coefficient of variation (CV) 2.2141104
Kurtosis 4.7567022
Mean 0.10278914
Median Absolute Deviation (MAD) 7.4851824 × 10⁻⁵
Skewness 2.4104604
Sum 19946.643
Variance 0.051795612
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Value Count Frequency (%)
4.123354768 × 10⁻⁵ 1773 0.9%
0.8679919686 1773 0.9%
1.542928473 × 10⁻⁸ 1773 0.9%
0.6790082437 1182 0.6%
1.957248679 × 10⁻⁶ 1182 0.6%
5.649688463 × 10⁻⁸ 985 0.5%
3.507978814 × 10⁻¹³ 985 0.5%
0.01135545391 788 0.4%
0.03056621829 788 0.4%
2.991004298 × 10⁻¹² 394 0.2%
Other values (949) 182431 94.0%

Value Count Frequency (%)
8.161027291 × 10⁻²² 197 0.1%
3.433958916 × 10⁻²¹ 197 0.1%
8.059521796 × 10⁻²¹ 197 0.1%
1.985479129 × 10⁻²⁰ 197 0.1%
2.438335113 × 10⁻²⁰ 197 0.1%
3.341543539 × 10⁻²⁰ 197 0.1%
4.620312974 × 10⁻²⁰ 197 0.1%
7.797449455 × 10⁻²⁰ 197 0.1%
1.668779009 × 10⁻¹⁹ 197 0.1%
2.6016043 × 10⁻¹⁹ 197 0.1%

Value Count Frequency (%)
0.9929014061 197 0.1%
0.9903145697 197 0.1%
0.9846711039 394 0.2%
0.9828406414 197 0.1%
0.9748929422 197 0.1%
0.9709810073 197 0.1%
0.964431531 197 0.1%
0.9593324877 80 < 0.1%
0.9565554087 197 0.1%
0.9507327187 80 < 0.1%

Interactions

[Figure: pairwise interaction plots, 36 panels]

Correlations

actual prediction residual fdr f_statistic f_pvalue
actual 1.000 0.970 0.243 -0.038 -0.010 -0.037
prediction 0.970 1.000 -0.000 -0.039 -0.011 -0.039
residual 0.243 -0.000 1.000 0.000 -0.000 0.000
fdr -0.038 -0.039 0.000 1.000 -0.447 0.999
f_statistic -0.010 -0.011 -0.000 -0.447 1.000 -0.431
f_pvalue -0.037 -0.039 0.000 0.999 -0.431 1.000

actual prediction residual fdr f_statistic f_pvalue
actual 1.000 0.952 0.117 0.042 -0.042 0.042
prediction 0.952 1.000 -0.105 0.043 -0.043 0.043
residual 0.117 -0.105 1.000 -0.004 0.004 -0.004
fdr 0.042 0.043 -0.004 1.000 -1.000 1.000
f_statistic -0.042 -0.043 0.004 -1.000 1.000 -1.000
f_pvalue 0.042 0.043 -0.004 1.000 -1.000 1.000

actual prediction residual fdr f_statistic f_pvalue
actual 1.000 0.827 0.089 0.029 -0.029 0.029
prediction 0.827 1.000 -0.084 0.030 -0.030 0.030
residual 0.089 -0.084 1.000 -0.003 0.003 -0.003
fdr 0.029 0.030 -0.003 1.000 -0.999 1.000
f_statistic -0.029 -0.030 0.003 -0.999 1.000 -1.000
f_pvalue 0.029 0.030 -0.003 1.000 -1.000 1.000

nichd actual prediction residual fdr f_statistic f_pvalue
nichd 1.000 0.051 0.057 0.042 0.050 0.038 0.050
actual 0.051 1.000 0.921 0.529 0.087 0.101 0.055
prediction 0.057 0.921 1.000 0.452 0.119 0.134 0.099
residual 0.042 0.529 0.452 1.000 0.034 0.039 0.025
fdr 0.050 0.087 0.119 0.034 1.000 0.561 0.986
f_statistic 0.038 0.101 0.134 0.039 0.561 1.000 0.545
f_pvalue 0.050 0.055 0.099 0.025 0.986 0.545 1.000
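
The matrices above disagree sharply on some pairs: fdr against f_statistic is reported as -0.447 in one and as -1.000 or -0.999 in others. A gap like this is what one would expect when a relationship is monotonic but not linear, because rank-based coefficients ignore the spacing between values. Which coefficient each matrix uses is not stated in this text, so the sketch below only illustrates the general effect on synthetic data. The data, the function names, and the choice of an exponential decay are all assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Synthetic, hypothetical data: y falls monotonically but non-linearly with x.
xs = [0.5 * i for i in range(1, 41)]
ys = [math.exp(-x) for x in xs]

r_linear = pearson(xs, ys)
r_rank = spearman(xs, ys)
print(f"Pearson  {r_linear:.3f}")
print(f"Spearman {r_rank:.3f}")
```

On this toy data the rank-based coefficient reaches -1 while the linear one stays well short of it, which is the same qualitative pattern as in the matrices.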

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

sample nichd probe gene_symbol actual prediction residual fdr f_statistic f_pvalue
0 GSM228562 early_adolescence 1431_at CYP2E1 17.435910 17.843241 -0.407332 1.086231e-16 94.430724 1.874626e-18
1 GSM228562 early_adolescence 1494_f_at CYP2A6 54.283612 54.710144 -0.426532 1.351243e-01 2.606134 1.080680e-01
2 GSM228562 early_adolescence 177_at PLD1 60.859872 43.915844 16.944027 5.697088e-01 0.414802 5.202989e-01
3 GSM228562 early_adolescence 200642_at SOD1 962.511454 1352.964044 -390.452591 1.186995e-01 2.827203 9.428051e-02
4 GSM228562 early_adolescence 200697_at HK1 1149.324116 1041.046330 108.277786 9.913209e-01 0.000148 9.903146e-01
5 GSM228562 early_adolescence 200737_at PGK1 684.484587 636.639791 47.844796 1.028237e-09 44.842116 2.223395e-10
6 GSM228562 early_adolescence 200738_s_at PGK1 2130.546845 2119.693853 10.852992 1.297986e-09 44.236963 2.859391e-10
7 GSM228562 early_adolescence 200768_s_at MAT2A 800.920865 577.009992 223.910873 2.005464e-08 37.358238 5.252647e-09
8 GSM228562 early_adolescence 200769_s_at MAT2A 121.135421 128.730212 -7.594791 9.805173e-07 27.818584 3.523726e-07
9 GSM228562 early_adolescence 200824_at GSTP1 724.555947 601.232991 123.322957 2.475873e-01 1.588881 2.089923e-01
sample nichd probe gene_symbol actual prediction residual fdr f_statistic f_pvalue
194044 Hyb9 early_childhood 243951_at ABCB1 23.040087 31.467560 -8.427473 2.673157e-10 48.378290 5.183241e-11
194045 Hyb9 early_childhood 244006_at POU2F1 7.624479 9.991895 -2.367417 6.293067e-05 18.301488 2.952107e-05
194046 Hyb9 early_childhood 244256_at CACNA1E 7.068173 7.671826 -0.603653 1.302867e-03 11.740490 7.457645e-04
194047 Hyb9 early_childhood 244266_at AKR1C2 35.022784 38.475372 -3.452588 7.949812e-10 45.555256 1.654453e-10
194048 Hyb9 early_childhood 244353_s_at SLC2A12 48.465743 29.016493 19.449250 1.762265e-08 37.677070 4.579887e-09
194049 Hyb9 early_childhood 244606_at ATP1A1 32.739583 22.997984 9.741599 3.573249e-02 5.012296 2.629805e-02
194050 Hyb9 early_childhood 244620_at SLC8A1 9.519466 7.343041 2.176426 2.113433e-05 20.754443 9.191528e-06
194051 Hyb9 early_childhood 37950_at PREP 121.472120 136.938438 -15.466318 1.825304e-03 11.025615 1.072604e-03
194052 Hyb9 early_childhood 40665_at FMO3 6.599359 5.944444 0.654915 4.236634e-17 98.008074 5.591237e-19
194053 Hyb9 early_childhood 48808_at DHFR 84.936420 40.915111 44.021308 3.299372e-03 9.799527 2.013859e-03
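
In the rows shown, residual appears to equal actual minus prediction. The report does not state this definition, so the sketch below treats it as a hypothesis and checks it against the first five rows of the Sample table. The tolerance accounts for the six-decimal rounding of the printed values.

```python
# (probe, actual, prediction, residual) copied from the first five sample rows.
rows = [
    ("1431_at", 17.435910, 17.843241, -0.407332),
    ("1494_f_at", 54.283612, 54.710144, -0.426532),
    ("177_at", 60.859872, 43.915844, 16.944027),
    ("200642_at", 962.511454, 1352.964044, -390.452591),
    ("200697_at", 1149.324116, 1041.046330, 108.277786),
]

# Hypothesis under test: residual == actual - prediction (up to rounding).
TOLERANCE = 1e-5

errors = {
    probe: abs((actual - prediction) - residual)
    for probe, actual, prediction, residual in rows
}
max_abs_error = max(errors.values())

for probe, err in errors.items():
    print(f"{probe:10s} |actual - prediction - residual| = {err:.2e}")
print("hypothesis holds on these rows:", max_abs_error < TOLERANCE)
```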